10 research outputs found

    Overview of BioCreative II gene mention recognition.

    Get PDF
    Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of different methods were used and the results varied with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is feasible, and furthermore that the best result makes use of the lowest scoring submissions

    Structuring and extracting knowledge for the support of hypothesis generation in molecular biology

    Get PDF
    Background: Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit the control over the modeling and extraction processes, we seek a methodology that supports control by the experimenter over these critical processes. Results: We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence. Conclusion: We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation

    Industrial requirements for ML application technology

    No full text
    this paper, it is clear that we have only a very rudimentary idea of a methodology for ML. On might ask why the ML community has made so little progress in comparison to e.g. the database community. There ara standard methods for the design and implementation of a database. Such things are not yet realised for ML. The main reason may be that a methodology for ML is much harder because there are fundamental philosophical issues on a more abstract level that have not been solved. A general theory of scientific heuristics, that would be a basis for such a methodology is missing. Many issues concerning a methodology for ML are really basic issues in methodology of science in disguise. My general feeling is that it will be difficult to formulate a methodology for ML is the underlying general patterns of scientic heuristics and creativity are not well understood. Recent work in complexity theory, and methodology of science may help us here

    Computational grammatical inference

    No full text
    Machine learning is currently one of the most rapidly growing areas of research in computer science. This book covers the three main learning systems; symbolic learning, neural networks and genetic algorithms as well as providing a tutorial on learning casual influences. Each of the nine chapters is self-contained.17 page(s

    AI on the ocean: the robosail project

    No full text
    Abstract. Sailing represents a new area of research within the field of applied AI. This paper gives a global overview of an ongoing broad-scoped research programme, called the RoboSail project (www.robosail.com). Part of this project is the realization of a semiautonomous intelligent pilot and a decision support system, assisting sailors in optimizing sailboat performance. The project encompasses a wide range of various real-world AI problems. In solving these problems, we argue that our approach to symbol grounding by means of using jargon amounts to a significant reduction in the problem’s complexity. The first implementation of this system has been successfully tested by some of the leading sport teams in the world of short-handed sailing.

    Structuring and extracting knowledge for the support of hypothesis generation in molecular biology

    No full text
    Abstract Background Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit the control over the modeling and extraction processes, we seek a methodology that supports control by the experimenter over these critical processes. Results We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence. Conclusion We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation.</p
    corecore